home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Fritz: All Fritz
/
All Fritz.zip
/
All Fritz
/
FILES
/
GAME_CGA
/
CRIBBAGE.LZH
/
HALSCRIB.TXT
< prev
next >
Wrap
Text File
|
1987-10-10
|
17KB
|
463 lines
HALSCRIB (c) 1987 Hal Mueller Page 1 of 7
The paper consists of the following sections:
1. Overview and approach of other computerized classic games
2. Elements and Rules of Cribbage
3. Optimal Strategy selection and deviation therefrom
employed during a game by the computer and the human
opponent
4. Preliminary results against human opponents of varying
skill levels
5. Summary and Conclusions
6. Further Research
1. Other Computerized Classic Games
A number of 2-person zero-sum games can be broadly
classified into those where
i. all information is available to both players
ii. some information is available to both players
iii. the element of chance plays no role whatsoever
iv. the element of chance plays some role in the outcome
v. the margin of victory may double the winner's payoff
Games such as CHESS, GO, and REVERSI/OTHELLO belong to categories
i. and iii.
POKER belongs to ii and iv.
BACKGAMMON belongs to categories i, iv, and v.
CRIBBAGE and DOMINOES belong to categories ii, iv, and v.
Typical move selection heuristics for Chess-playing programs
include mini-max search (including alpha-beta pruning), search
for "killer" moves, and iterative deepening based on an
evaluation function. The depth and width of the search
materially affect the strength of the program.
In Go, the size of the search tree precludes the approach taken
in Chess. Pattern-matching heuristics in conjunction with
massive parallel processing probably will result in GO-playing
programs that approach that of a human expert.
HALSCRIB (c) 1987 Hal Mueller Page 2 of 7
Work done in computer Poker indicates a weakness on the reliance
on expectations and the use of measurements of the opponents'
past performance for judging "Bluffing" situations. Programs
written by Findler et al implemented different playing
strategies, Mathematically Fair (static strategy - decides when
and what to bet by equating expected values of wins and losses -
bets strictly according to the odds), Statistically Fair
(dynamic strategy - includes a learning component that identifies
opponents' extent and frequency of bluffing), and several others
(RH Player, Bayesian Player, Pattern Recognition Player, Advice
Taker and Inquirer, Quasi-Optimizer Player, etc).
Backgammon programs by H. Berliner, Carnegie-Mellon, utilize a
smoothing approach to strategy variation as dictated by the
relative status of the players' remaining men. He classified
different backgammon positions such that the evaluation space
provided for increased importance of particular features. He
controlled these transitions (dependent on other features as
well) by "application coefficients" which vary slowly and
smoothly avoiding the rough boundaries which exist between the
different classified positions. (This approach is called SNAC,
for smoothness, nonlinearity, and application coefficients).
Most Cribbage games encountered by the author follow a static
strategy based on heuristics, regardless of the relative status
of the players' scores. During the selection of the discards,
and the play of the hands, several opportunities exist to vary
strategical decisions.
2. Elements and Rules of Cribbage
Cribbage is one of the most enjoyable card games ever developed
for two players. Not only does it provide scope for informed
strategic decisions, but also offers numerous ways of scoring,
both in the play and in the count of the hands and crib.
Initially, an ordinary deck of 52 playing cards is shuffled.
Each player "cuts" the deck, exposing a card. By mutual
agreement beforehand, the player cutting the lower (or higher) is
designated as the "dealer" for the first deal. The deal
alternates afterwards.
The dealer shuffles the deck randomly and then deals six cards
alternately, one at a time, face down to each player. The
players examine their hands, and decide which two cards to
discard face down in the crib. The crib belongs to the dealer.
After the crib has been formed, the non-dealer cuts the cards.
The dealer then turns face up the top card of the lower deck
which is then placed on top of the deck. This top card is the
up-card (or "starter").
HALSCRIB (c) 1987 Hal Mueller Page 3 of 7
The remaining four cards are retained and become revealed during
the play. The non-dealer begins the play by laying out one of
his cards face up and announcing its value. (Aces are 1, face-
cards are all 10, remaining cards have a value equal to their
pip-value). The dealer then lays out a card in the same fashion,
but announces the cumulative sum - previous cumulative sum plus
the value of his card.
The play alternates until the cumulative sum is 31 or neither
player can play a card without exceeding 31. The cumulative sum
is reset to 0 and play commences as before until all four cards
have been layed out by both players.
During the play either player can score points in a number of
ways. This process of obtaining points is called "pegging".
After the hand has been played, the non-dealer computes the value
of his hand in conjunction with the up-card which can be
included in the evaluation of each player's hand as well as the
crib.
After the non-dealer has scored his hand, the dealer scores his
hand and then his crib. The first player to reach 121 points
wins the game. If the other player has less than 91 points but
more than 60, the value of the game is doubled ("skunked"), or
the loser has less than 61 points the value of the game is
quadrupled ("double-skunk").
Points can be scored in a number of ways, some of which are
similar to the way in which points during pegging may be scored.
Scoring During Pegging:
15 or 31:
If a player makes the cumulative sum = 15 or = 31, then that
player pegs 2 points (increases his score by 2).
Pairs:
If a player plays a card of the same rank as the last one (in
play), then that player pegs 2 points.
If, after a pair has been made, a card of the same rank can be
legally played, then that player pegs 6 points (3-of-a-kind,
triplets, or Pairs Royal).
If, after a 3-of-a-kind has been made, a card of the same rank
can be legally played, then that player pegs 12 points (4-of-a-
kind, double pairs, or double Pairs Royal). (Can only happen
with cards whose rank is 7 or less.)
HALSCRIB (c) 1987 Hal Mueller Page 4 of 7
Sequences or Runs:
If 3 or more cards in uninterrupted numerical sequence, not
necessarily of the same suit nor in strict ascending or
descending sequence - eg 3-4-2-5 but not 3-4-6-2 have been
played, the player who completed the run or sequence pegs 1 point
for each card in the uninterrupted sequence. In the first
example the player who played the 2 pegs 3 points (run of 3) and
the player who played the 5 pegs 4 points (run of 4). Whereas in
the second example there are no runs at all, but the player on
turn could peg 5 points by playing a 5 (run of 5).
Last-card or Go:
If a player on turn is unable to play another card without
exceeding 31, he says Go and the other player must go on playing,
if he can, until he reaches 31 or can not play another card
without exceeding 31 himself. If he is also unable to play, he
also announces Go and pegs 1 point. The last card played scores
1 point unless it makes the cumulative sum = 31, in which case
only 2 points are pegged.
Scoring of Hands and Crib:
After pegging has been completed, the non-dealer places his cards
face up and counts his hand including the up-card and makes as
many scoring combinations of 15's, pairs, runs, flushes, as
possible.
15's:
Every combination of cards (with or without the up-card) that
totals 15 scores 2 points.
Pairs:
Every combination of cards that forms a pair (with or without the
up-card) scores 2 points.
Runs:
Every combination of cards (with or without the up-card) that
forms a run of 3 or more, regardless of suit, scores 1 point for
each card in the run.
Flushes:
Four cards of the same suit in the hand or all cards of the same
suit as the up-card scores 1 point for each card of the same
suit. (Score is 4 or 5).
However, the crib must have all cards of the same suit as the up-
card to score 5. (Score of 4 is not possible.)
HALSCRIB (c) 1987 Hal Mueller Page 5 of 7
His Nibs (His Nobs, Jack-in-the-Hand or Crib):
One point is scored for having a Jack in the hand or in the crib
of the same suit as the up-card.
Up-card:
If a Jack is the Up-card, the dealer pegs 2 points immediately
after the cut has been made.
Optimal Strategy Selection
Hand Selection from 6 Dealt
There are 15 different combinations of 4 cards (to be retained in
the hand) and 2 cards (to be discarded into the crib).
The program decides which of 7 different strategies it should adopt:
max of 4-card scores (ignoring up-card and crib)
max of max best 5-card score - min crib | dealer is opp
max of max best 5-card score + max crib | dealer is comp
max of expected 5-card score - expected crib | dealer is opp
max of expected 5-card score + expected crib | dealer is comp
max of expected 5-card score - max crib | dealer is opp
max of expected 5-card score + expected crib | dealer is comp
The strategy selected is dependent on the score and the deal.
Whenever the difference in score between the two players is less
than 16, the strategy it adopts is the "optimal" (expected hand and
expected crib are utilized). This strategy is "optimal" in the
sense that the player will maximize his points, on average, over a
large number of plays of a particular holding.
If it is behind by more than 15 points it adopts a "risky" strategy.
(Chooses max of best 5-card hand + max crib or - min crib). This
strategy is risky because it depends on obtaining the most
favourable up-card. This up-card would not, in general, be the most
likely.
If it is ahead by more than 15 points it adopts a "safe" strategy.
(Chooses max of expected hand + expected crib or - max crib). If
the computer is the dealer, its strategy is similar to the optimal
strategy (ties are broken differently). However, if the opponent is
the dealer, then it will choose its discards so as to minimize their
benefit to the opponent given the most undesirable up-card. Again,
this unfavourable up-card would not, in general, be the most likely.
If it can reach game, it adopts a "safe" strategy.
If the opponent counts first and is expected to reach 121, then it
adopts a "risky" strategy.
HALSCRIB (c) 1987 Hal Mueller Page 6 of 7
If it counts first and the opponent is expected to reach 121, then
it adopts a "safe" strategy if it has less than 91 and can exceed 90
based on its hand (to avoid a skunk) and similarly if it has less
than 61 and can exceed 60 (to avoid a double-skunk).
If it expects to reach game and the opponent counts first and has
less than 91 (or 61) and expects the opponent to avoid the skunk (or
double-skunk), then it adopts a "risky" strategy.
During the play of the hand, the current point differential is used
in selecting its strategy: safe, optimal, risky.
For each possible legal card, it evaluates the expected pegging
score (using current probabilities), the maximum possible net
pegging score (ignoring probabilities), and the maximum possible
opponent pegging score (ignoring probabilities). The optimal
strategy selects the max expected pegging score. The safe strategy
selects the min of opponent max pegging score. The risky strategy
selects the max of net pegging score.
If it is on lead and there is no pair in its holding, it will select
that card which maximizes its expected value. If there is a pair,
the expected value of always leading a card from a pair is greater
than from any non-pair card playing against an unbiased opponent.
Because a human opponent can easily detect this bias, it will forego
the mathematically sound play of the pair and choose an "inferior"
card between 50-80% of the time. (The percentage is dependent on
the point differential). Similarly, if it is on play after the
opening lead, it will select that card which maximizes its expected
value. Because a bias in always pairing if possible also can be
detected easily, it will forego the mathematically optimum play and
select an "inferior" card between 35-65% of the time.
These percentages were derived from the following "Pay-off Matrix"
and then adjusted to reflect actual frequencies of applicability.
Table I (simplistic) Table II (probabilistic)
|| 4 -2 || || .40 -.23 ||
|| -2 2 || || -.25 .24 ||
Optimal strategy is to lead a pair 40% (44%) of the time - pair the
first card 40% (42%) of the time. These ratios are adjusted to
allow for those situations when no choice is possible (no pair
available to lead - no card available to pair).
Preliminary Results
During the initial trials of the program against myself and
knowledgable cribbage players, it became obvious that when on lead
it invariably led from a small pair whenever it could. This
selection is optimal if there is no bias in the play of its opponent
or if the opponent also plays in "optimal" fashion. However, a
human opponent can take advantage of this prediliction and avoid
pairing cards immediately at some slight risk of not pegging
anything or being paired by the computer later in the play of the
hand.
HALSCRIB (c) 1987 Hal Mueller Page 7 of 7
A change was made whereby it deviated from the (biased) optimum, so
that when it had a pair, it made that selection 65% of the time in
"optimal" situations, 80% of the time in "safe" situations, and 50%
of the time in "risky" situations. (In "safe" situations, a pair by
the risk-taking opponent would allow it to increase its lead.) (In
"risky" situations, a cautious opponent would avoid pairing so as to
avoid having the computer peg 6 for 3-of-a-kind. It would increase
the possibility of it pairing its own card later in the play.)
This made the computer far less predictable and consequently
increased its playing strength.
Cumulative tally:
Before (biased optimum) - 20 wins by computer
31 wins by opponents
After (unbiased optimum) - 72 wins by computer
53 wins by opponents
Summary and Conclusions
Cribbage is a game where a knowledge of permutations and combinations
is essential to optimal play. A knowledge of the predictable
playing style of the opponent can materially affect the outcome of
an extended series of games.
Further Research
At present, the program adopts a "static" statistically optimum mixed
strategy when leading and pairing (see Pay-off Matrix) against all
opponents. The next step is to monitor the pegging-style of each
opponent and adjust its own pegging-style ratios accordingly. In
addition, it should use current probabilities in computing the value
of the pay-off matrix entries. The result would be a "dynamic" sta-
tistical optimal mixed strategy for the initial lead or initial
response to the opening lead.
How this program would fare against an opponent who uses "pure"
strategies only or against a "static" statistical mixed strategy in
a match of 15 games would be interesting.
BIBLIOGRAPHY and REFERENCES
Anderson, Douglas, All About Cribbage, Winchester Press, NYC, 1971
Berliner, Hans, Computer Backgammon, Scientific American June 1980
Findler, Nicholas V., Computer Poker, Scientific American July 1978
Levy, David, Computer Gamesmanship, Century Publishing, London, 1983